Overview

Dataset statistics

Number of variables: 25
Number of observations: 3953
Missing cells: 1047
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 772.2 KiB
Average record size in memory: 200.0 B
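
The percentage and memory figures above can be re-derived from the raw counts. The sketch below does so under one assumption that the report does not state: every column stores 8 bytes per row. That assumption is consistent with the 30.9 KiB reported for each variable further down.

```python
# Figures copied from the dataset statistics above.
n_variables = 25
n_observations = 3953
missing_cells = 1047

# Share of missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption (not stated in the report): 8 bytes per value in every column.
BYTES_PER_VALUE = 8
column_kib = n_observations * BYTES_PER_VALUE / 1024
record_bytes = n_variables * BYTES_PER_VALUE

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Per-column memory: {column_kib:.1f} KiB")
print(f"Record size: {record_bytes} B")
```

Under that assumption the three printed values agree with the report's 1.1%, 30.9 KiB and 200.0 B.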

Variable types

CAT: 14
NUM: 10
BOOL: 1

Reproduction

Analysis started: 2020-05-21 07:22:13.042151
Analysis finished: 2020-05-21 07:22:34.800182
Duration: 21.76 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

name has a high cardinality: 3682 distinct values High cardinality
email_id has a high cardinality: 3373 distinct values High cardinality
university has a high cardinality: 3140 distinct values High cardinality
zip_code has a high cardinality: 615 distinct values High cardinality
funded_amnt_inv is highly correlated with loan_amnt and 1 other field High correlation
loan_amnt is highly correlated with funded_amnt_inv and 1 other field High correlation
installment is highly correlated with loan_amnt and 1 other field High correlation
sub_grade is highly correlated with grade High correlation
grade is highly correlated with sub_grade High correlation
name has 271 (6.9%) missing values Missing
email_id has 580 (14.7%) missing values Missing
gender has 78 (2.0%) missing values Missing
university has 118 (3.0%) missing values Missing
name is uniformly distributed Uniform
email_id is uniformly distributed Uniform
university is uniformly distributed Uniform
dt_applied has unique values Unique
delinq_2yrs has 3628 (91.8%) zeros Zeros
inq_last_6mths has 1822 (46.1%) zeros Zeros
revol_bal has 42 (1.1%) zeros Zeros
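
The percentages in the Missing and Zeros warnings are each count divided by the number of observations. A small sketch that reproduces them; the helper name `share` is illustrative and not part of pandas-profiling:

```python
N_OBSERVATIONS = 3953  # from the dataset statistics

def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format a count as a percentage of all rows, with one decimal place."""
    return f"{100 * count / total:.1f}%"

# Counts copied from the warnings above.
missing = {"name": 271, "email_id": 580, "gender": 78, "university": 118}
zeros = {"delinq_2yrs": 3628, "inq_last_6mths": 1822, "revol_bal": 42}

for column, count in missing.items():
    print(f"{column} has {count} ({share(count)}) missing values")
for column, count in zeros.items():
    print(f"{column} has {count} ({share(count)}) zeros")
```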

Variables

name
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct count: 3682
Unique (%): 100.0%
Missing: 271
Missing (%): 6.9%
Memory size: 30.9 KiB

Value                 Count  Frequency (%)
Skipper Artinstall    1      < 0.1%
Collie Gert           1      < 0.1%
Mikol Salvage         1      < 0.1%
Tilda Clynter         1      < 0.1%
Gwen Lorenzin         1      < 0.1%
Ardyth Garvagh        1      < 0.1%
Lotti Joutapavicius   1      < 0.1%
Mortie Gladdifh       1      < 0.1%
Tonia Rooke           1      < 0.1%
Dannie Maier          1      < 0.1%
Other values (3672)   3672   92.9%
(Missing)             271    6.9%

Length

Max length: 23
Median length: 14
Mean length: 13.27649886
Min length: 3

email_id
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct count: 3373
Unique (%): 100.0%
Missing: 580
Missing (%): 14.7%
Memory size: 30.9 KiB

Value                           Count  Frequency (%)
ndureden5i@lulu.com             1      < 0.1%
jstothert3t@prlog.org           1      < 0.1%
lblenkinshipn5@jiathis.com      1      < 0.1%
wcordeaucx@blogger.com          1      < 0.1%
bmanicombae@timesonline.co.uk   1      < 0.1%
jpolotti9i@dedecms.com          1      < 0.1%
ohedde2j@mozilla.com            1      < 0.1%
hmitroviclv@telegraph.co.uk     1      < 0.1%
rheaphyjd@mlb.com               1      < 0.1%
gaberhart9@mozilla.com          1      < 0.1%
Other values (3363)             3363   85.1%
(Missing)                       580    14.7%

Length

Max length: 35
Median length: 21
Mean length: 19.06982039
Min length: 3

gender
Categorical

MISSING

Distinct count: 2
Unique (%): 0.1%
Missing: 78
Missing (%): 2.0%
Memory size: 30.9 KiB

Value       Count  Frequency (%)
Male        1970   49.8%
Female      1905   48.2%
(Missing)   78     2.0%

Length

Max length: 6
Median length: 4
Mean length: 4.944093094
Min length: 3

dt_applied
Categorical

UNIQUE

Distinct count: 3953
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value                 Count  Frequency (%)
03/01/82              1      < 0.1%
23/04/81              1      < 0.1%
25/11/83              1      < 0.1%
25/08/85              1      < 0.1%
12/04/89              1      < 0.1%
13/05/90              1      < 0.1%
21/11/86              1      < 0.1%
01/05/87              1      < 0.1%
01/03/84              1      < 0.1%
21/02/81              1      < 0.1%
Other values (3943)   3943   99.7%

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8
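
dt_applied is profiled as Categorical, presumably because the dates are stored as 8-character strings. Values such as 25/11/83 can only be read as day/month/year. The sketch below parses them with the standard library, assuming the dd/mm/yy layout holds for every row. It relies on Python's `%y` directive, which maps two-digit years 69 to 99 onto 1969 to 1999. The function name `parse_applied` is illustrative.

```python
from datetime import date, datetime

def parse_applied(value: str) -> date:
    """Parse a dd/mm/yy string such as '25/11/83' into a date."""
    return datetime.strptime(value, "%d/%m/%y").date()

# Sample values copied from the dt_applied frequency table.
samples = ["03/01/82", "23/04/81", "25/11/83", "25/08/85", "12/04/89"]
parsed = [parse_applied(s) for s in samples]

print(min(parsed), max(parsed))
```

Once parsed, the column can be sorted and compared chronologically, which the string form does not allow.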

university
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct count: 3140
Unique (%): 81.9%
Missing: 118
Missing (%): 3.0%
Memory size: 30.9 KiB

Value                                                Count  Frequency (%)
Christchurch Polytechnic Institute of Technology     4      0.1%
Universidad de Congreso                              4      0.1%
Jiangxi University of Traditional Chinese Medicine   4      0.1%
Abant Izzet Baysal University                        4      0.1%
Fukuoka Institute of Technology                      4      0.1%
Phillips Graduate Institute                          4      0.1%
Stavropol State Technical University                 4      0.1%
Arab Open University                                 4      0.1%
Universidad Tecnológica de México                    4      0.1%
Carlow College                                       4      0.1%
Other values (3130)                                  3795   96.0%
(Missing)                                            118    3.0%

Length

Max length: 114
Median length: 28
Mean length: 29.67088287
Min length: 3

loan_amnt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 434
Unique (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13017.499367568935
Minimum: 1000
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 3000
Q1: 6500
median: 12000
Q3: 17625
95-th percentile: 30000
Maximum: 35000
Range: 34000
Interquartile range (IQR): 11125

Descriptive statistics

Standard deviation: 8155.330342
Coefficient of variation (CV): 0.6264897821
Kurtosis: 0.3258532123
Mean: 13017.49937
Median Absolute Deviation (MAD): 5500
Skewness: 0.9233128761
Sum: 51458175
Variance: 66509412.98

Histogram with fixed size bins (bins=10)

Common values

Value                Count  Frequency (%)
12000                315    8.0%
10000                259    6.6%
15000                190    4.8%
20000                174    4.4%
6000                 165    4.2%
5000                 153    3.9%
35000                143    3.6%
8000                 124    3.1%
16000                99     2.5%
25000                97     2.5%
Other values (424)   2234   56.5%

Minimum 5 values

Value   Count  Frequency (%)
1000    21     0.5%
1100    1      < 0.1%
1200    9      0.2%
1300    2      0.1%
1325    1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
35000   143    3.6%
34475   1      < 0.1%
34000   2      0.1%
33950   1      < 0.1%
33600   2      0.1%
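
Several of the loan_amnt statistics are simple functions of the others, which gives a quick consistency check on any numeric block in this report. A minimal sketch using the rounded values printed above, so the comparisons are approximate:

```python
import math

# Values copied from the loan_amnt statistics.
n = 3953
minimum, q1, q3, maximum = 1000, 6500, 17625, 35000
mean = 13017.49937
std = 8155.330342

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)
cv = std / mean                   # reported as Coefficient of variation (CV)
variance = std ** 2               # reported as Variance
total = mean * n                  # reported as Sum

print(value_range, iqr)
print(f"CV={cv:.6f} variance={variance:.2f} sum={total:.0f}")
```

The same relations (Range = Maximum - Minimum, IQR = Q3 - Q1, CV = standard deviation / mean, Variance = standard deviation squared, Sum = mean x count) apply to the other numeric variables.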
 

funded_amnt_inv
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 828
Unique (%): 20.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12809.792160966355
Minimum: 750.0
Maximum: 35000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 750
5-th percentile: 3000
Q1: 6500
median: 11775
Q3: 17000
95-th percentile: 29735
Maximum: 35000
Range: 34250
Interquartile range (IQR): 10500

Descriptive statistics

Standard deviation: 7935.907682
Coefficient of variation (CV): 0.619518848
Kurtosis: 0.3951370723
Mean: 12809.79216
Median Absolute Deviation (MAD): 5275
Skewness: 0.9263171893
Sum: 50637108.41
Variance: 62978630.74

Histogram with fixed size bins (bins=10)

Common values

Value                Count  Frequency (%)
12000                249    6.3%
10000                222    5.6%
6000                 153    3.9%
5000                 143    3.6%
15000                139    3.5%
8000                 113    2.9%
7000                 87     2.2%
3000                 74     1.9%
20000                72     1.8%
14000                64     1.6%
Other values (818)   2637   66.7%

Minimum 5 values

Value   Count  Frequency (%)
750     1      < 0.1%
1000    20     0.5%
1100    1      < 0.1%
1200    9      0.2%
1300    2      0.1%

Maximum 5 values

Value         Count  Frequency (%)
35000         37     0.9%
34997.35245   1      < 0.1%
34993.65539   1      < 0.1%
34987.98452   1      < 0.1%
34987.27101   1      < 0.1%
 

term
Categorical

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value       Count  Frequency (%)
36 months   2687   68.0%
60 months   1266   32.0%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10
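
Every term value has length 10, although "36 months" and "60 months" are 9 characters long. This suggests, though the report does not confirm it, that the raw strings carry one extra whitespace character. The sketch below converts a term label to an integer number of months in a way that tolerates such padding. The function name `term_months` is illustrative.

```python
def term_months(raw: str) -> int:
    """Convert a term label such as ' 36 months' to the integer 36."""
    # str.split() with no arguments ignores leading, trailing and repeated whitespace.
    number, unit = raw.split()
    if unit != "months":
        raise ValueError(f"unexpected term unit: {unit!r}")
    return int(number)

# The padded form is an assumption based on the reported length of 10.
print(term_months(" 36 months"), term_months("60 months"))
```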

int_rate
Real number (ℝ≥0)

Distinct count: 35
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1296908676954212
Minimum: 0.06
Maximum: 0.24100000000000002
Zeros: 0
Zeros (%): 0.0%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0.06
5-th percentile: 0.066
Q1: 0.099
median: 0.127
Q3: 0.16
95-th percentile: 0.203
Maximum: 0.241
Range: 0.181
Interquartile range (IQR): 0.061

Descriptive statistics

Standard deviation: 0.04160931484
Coefficient of variation (CV): 0.3208345782
Kurtosis: -0.6951924625
Mean: 0.1296908677
Median Absolute Deviation (MAD): 0.033
Skewness: 0.226416223
Sum: 512.668
Variance: 0.001731335081

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
0.117               324    8.2%
0.127               259    6.6%
0.079               259    6.6%
0.124               254    6.4%
0.135               231    5.8%
0.143               226    5.7%
0.107               213    5.4%
0.099               211    5.3%
0.089               198    5.0%
0.06                160    4.0%
Other values (25)   1618   40.9%

Minimum 5 values

Value   Count  Frequency (%)
0.06    160    4.0%
0.066   156    3.9%
0.075   137    3.5%
0.079   259    6.6%
0.089   198    5.0%

Maximum 5 values

Value   Count  Frequency (%)
0.241   2      0.1%
0.239   6      0.2%
0.235   6      0.2%
0.231   4      0.1%
0.227   6      0.2%
 

installment
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1923
Unique (%): 48.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 375.2073362003542
Minimum: 32.23
Maximum: 1283.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 32.23
5-th percentile: 93.88
Q1: 205.86
median: 336
Q3: 494.59
95-th percentile: 813.626
Maximum: 1283.5
Range: 1251.27
Interquartile range (IQR): 288.73

Descriptive statistics

Standard deviation: 220.261152
Coefficient of variation (CV): 0.5870385006
Kurtosis: 0.8900854243
Mean: 375.2073362
Median Absolute Deviation (MAD): 140.06
Skewness: 0.9837168213
Sum: 1483194.6
Variance: 48514.9751

Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
330.76                27     0.7%
396.92                25     0.6%
325.74                22     0.6%
386.7                 21     0.5%
339.31                20     0.5%
322.25                19     0.5%
334.16                19     0.5%
343.09                18     0.5%
190.52                18     0.5%
368.45                17     0.4%
Other values (1913)   3747   94.8%

Minimum 5 values

Value   Count  Frequency (%)
32.23   1      < 0.1%
32.58   2      0.1%
33.08   2      0.1%
33.55   1      < 0.1%
33.94   3      0.1%

Maximum 5 values

Value     Count  Frequency (%)
1283.5    1      < 0.1%
1276.6    1      < 0.1%
1269.73   1      < 0.1%
1243.85   1      < 0.1%
1222.03   1      < 0.1%
 

grade
Categorical

HIGH CORRELATION

Distinct count: 7
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value   Count  Frequency (%)
B       1262   31.9%
A       908    23.0%
C       811    20.5%
D       510    12.9%
E       313    7.9%
F       125    3.2%
G       24     0.6%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

sub_grade
Categorical

HIGH CORRELATION

Distinct count: 35
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value               Count  Frequency (%)
B3                  324    8.2%
B5                  260    6.6%
A4                  259    6.6%
B4                  254    6.4%
C1                  231    5.8%
C2                  227    5.7%
B2                  213    5.4%
B1                  211    5.3%
A5                  198    5.0%
A1                  158    4.0%
Other values (25)   1618   40.9%

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

home_ownership
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value      Count  Frequency (%)
RENT       2081   52.6%
MORTGAGE   1577   39.9%
OWN        295    7.5%

Length

Max length: 8
Median length: 4
Mean length: 5.521123198
Min length: 3

annual_inc
Real number (ℝ≥0)

Distinct count: 813
Unique (%): 20.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66175.9735365545
Minimum: 8280.0
Maximum: 550000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 8280
5-th percentile: 25000
Q1: 40100
median: 57000
Q3: 80000
95-th percentile: 135880
Maximum: 550000
Range: 541720
Interquartile range (IQR): 39900

Descriptive statistics

Standard deviation: 40498.80417
Coefficient of variation (CV): 0.6119865264
Kurtosis: 18.71426089
Mean: 66175.97354
Median Absolute Deviation (MAD): 18000
Skewness: 3.058200935
Sum: 261593623.4
Variance: 1640153139

Histogram with fixed size bins (bins=10)

Common values

Value                Count  Frequency (%)
60000                154    3.9%
50000                149    3.8%
75000                120    3.0%
40000                120    3.0%
45000                114    2.9%
70000                96     2.4%
80000                93     2.4%
30000                93     2.4%
65000                88     2.2%
35000                82     2.1%
Other values (803)   2844   71.9%

Minimum 5 values

Value   Count  Frequency (%)
8280    1      < 0.1%
8400    1      < 0.1%
9600    1      < 0.1%
9960    1      < 0.1%
10000   1      < 0.1%

Maximum 5 values

Value    Count  Frequency (%)
550000   1      < 0.1%
525000   1      < 0.1%
408000   1      < 0.1%
400000   2      0.1%
365000   1      < 0.1%
 

verification_status
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value             Count  Frequency (%)
Verified          1515   38.3%
Not Verified      1247   31.5%
Source Verified   1191   30.1%

Length

Max length: 15
Median length: 12
Mean length: 11.37085758
Min length: 8

loan_writeoff
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value   Count  Frequency (%)
0       3275   82.8%
1       678    17.2%
 

purpose
Categorical

Distinct count: 13
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value                Count  Frequency (%)
debt_consolidation   2102   53.2%
credit_card          792    20.0%
other                297    7.5%
home_improvement     196    5.0%
small_business       145    3.7%
major_purchase       100    2.5%
car                  90     2.3%
wedding              63     1.6%
medical              52     1.3%
moving               39     1.0%
Other values (3)     77     1.9%

Length

Max length: 18
Median length: 18
Mean length: 14.28307614
Min length: 3

zip_code
Categorical

HIGH CARDINALITY

Distinct count: 615
Unique (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value                Count  Frequency (%)
900xx                55     1.4%
606xx                55     1.4%
100xx                54     1.4%
112xx                50     1.3%
945xx                49     1.2%
070xx                45     1.1%
331xx                44     1.1%
750xx                41     1.0%
300xx                41     1.0%
113xx                40     1.0%
Other values (605)   3479   88.0%

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

add_state
Categorical

Distinct count: 43
Unique (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value               Count  Frequency (%)
CA                  729    18.4%
NY                  372    9.4%
FL                  304    7.7%
TX                  273    6.9%
NJ                  181    4.6%
IL                  155    3.9%
GA                  146    3.7%
PA                  136    3.4%
VA                  130    3.3%
OH                  124    3.1%
Other values (33)   1403   35.5%

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

dti
Real number (ℝ≥0)

Distinct count: 1961
Unique (%): 49.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.428287376675943
Minimum: 0.0
Maximum: 29.85
Zeros: 3
Zeros (%): 0.1%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3.932
Q1: 9.58
median: 14.45
Q3: 19.47
95-th percentile: 24.214
Maximum: 29.85
Range: 29.85
Interquartile range (IQR): 9.89

Descriptive statistics

Standard deviation: 6.378445753
Coefficient of variation (CV): 0.4420792008
Kurtosis: -0.7703420751
Mean: 14.42828738
Median Absolute Deviation (MAD): 4.94
Skewness: -0.04903565752
Sum: 57035.02
Variance: 40.68457022

Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
11.8                  9      0.2%
18.63                 8      0.2%
20.88                 8      0.2%
9.65                  7      0.2%
12.48                 7      0.2%
18.84                 7      0.2%
17.67                 7      0.2%
16.4                  7      0.2%
19.63                 7      0.2%
16.2                  7      0.2%
Other values (1951)   3879   98.1%

Minimum 5 values

Value   Count  Frequency (%)
0       3      0.1%
0.02    2      0.1%
0.07    1      < 0.1%
0.2     1      < 0.1%
0.25    1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
29.85   1      < 0.1%
29.83   1      < 0.1%
29.73   1      < 0.1%
29.72   1      < 0.1%
29.63   1      < 0.1%
 

delinq_2yrs
Real number (ℝ≥0)

ZEROS

Distinct count: 6
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10852517075638755
Minimum: 0
Maximum: 6
Zeros: 3628
Zeros (%): 91.8%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4087983222
Coefficient of variation (CV): 3.766852606
Kurtosis: 32.99870086
Mean: 0.1085251708
Median Absolute Deviation (MAD): 0
Skewness: 4.954297207
Sum: 429
Variance: 0.1671160683

Histogram with fixed size bins (bins=10)

Common values

Value   Count  Frequency (%)
0       3628   91.8%
1       246    6.2%
2       61     1.5%
3       13     0.3%
4       4      0.1%
6       1      < 0.1%

Minimum 5 values

Value   Count  Frequency (%)
0       3628   91.8%
1       246    6.2%
2       61     1.5%
3       13     0.3%
4       4      0.1%

Maximum 5 values

Value   Count  Frequency (%)
6       1      < 0.1%
4       4      0.1%
3       13     0.3%
2       61     1.5%
1       246    6.2%
 

inq_last_6mths
Real number (ℝ≥0)

ZEROS

Distinct count: 9
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8555527447508222
Minimum: 0
Maximum: 8
Zeros: 1822
Zeros (%): 46.1%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.997025005
Coefficient of variation (CV): 1.165357731
Kurtosis: 2.163689287
Mean: 0.8555527448
Median Absolute Deviation (MAD): 1
Skewness: 1.26526022
Sum: 3382
Variance: 0.9940588606

Histogram with fixed size bins (bins=10)

Common values

Value   Count  Frequency (%)
0       1822   46.1%
1       1245   31.5%
2       584    14.8%
3       265    6.7%
4       21     0.5%
5       10     0.3%
6       3      0.1%
7       2      0.1%
8       1      < 0.1%

Minimum 5 values

Value   Count  Frequency (%)
0       1822   46.1%
1       1245   31.5%
2       584    14.8%
3       265    6.7%
4       21     0.5%

Maximum 5 values

Value   Count  Frequency (%)
8       1      < 0.1%
7       2      0.1%
6       3      0.1%
5       10     0.3%
4       21     0.5%
 

pub_rec
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.9 KiB

Value   Count  Frequency (%)
0       3831   96.9%
1       120    3.0%
2       2      0.1%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

revol_bal
Real number (ℝ≥0)

ZEROS

Distinct count: 3672
Unique (%): 92.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14367.447508221603
Minimum: 0
Maximum: 140967
Zeros: 42
Zeros (%): 1.1%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1240.4
Q1: 6352
median: 11449
Q3: 18151
95-th percentile: 35148.4
Maximum: 140967
Range: 140967
Interquartile range (IQR): 11799

Descriptive statistics

Standard deviation: 13468.63453
Coefficient of variation (CV): 0.937441012
Kurtosis: 18.01764983
Mean: 14367.44751
Median Absolute Deviation (MAD): 5657
Skewness: 3.322035836
Sum: 56794520
Variance: 181404116.1

Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
0                     42     1.1%
8032                  3      0.1%
6314                  3      0.1%
14848                 3      0.1%
10980                 3      0.1%
11338                 3      0.1%
15183                 3      0.1%
8357                  3      0.1%
6565                  3      0.1%
13034                 3      0.1%
Other values (3662)   3884   98.3%

Minimum 5 values

Value   Count  Frequency (%)
0       42     1.1%
3       1      < 0.1%
6       1      < 0.1%
8       1      < 0.1%
16      1      < 0.1%

Maximum 5 values

Value    Count  Frequency (%)
140967   1      < 0.1%
131949   1      < 0.1%
130920   1      < 0.1%
124744   1      < 0.1%
123416   1      < 0.1%
 

total_paymnt
Real number (ℝ≥0)

Distinct count: 3710
Unique (%): 93.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14435.064318165443
Minimum: 0.0
Maximum: 58886.47343
Zeros: 2
Zeros (%): 0.1%
Memory size: 30.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2401.064047
Q1: 6614.78722
median: 11907.35
Q3: 19190.68001
95-th percentile: 35788.92425
Maximum: 58886.47343
Range: 58886.47343
Interquartile range (IQR): 12575.89279

Descriptive statistics

Standard deviation: 10492.53033
Coefficient of variation (CV): 0.7268779753
Kurtosis: 1.593830926
Mean: 14435.06432
Median Absolute Deviation (MAD): 5937.176941
Skewness: 1.261678967
Sum: 57061809.25
Variance: 110093192.6

Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
14288.76169           8      0.2%
13148.13786           7      0.2%
11907.34732           7      0.2%
12029.45              7      0.2%
11600.98              6      0.2%
14288.77              5      0.1%
11726.32              5      0.1%
10956.77596           5      0.1%
9011.557494           5      0.1%
13263.96              5      0.1%
Other values (3700)   3893   98.5%

Minimum 5 values

Value    Count  Frequency (%)
0        2      0.1%
91.39    1      < 0.1%
151.8    1      < 0.1%
165.37   1      < 0.1%
203.55   1      < 0.1%

Maximum 5 values

Value         Count  Frequency (%)
58886.47343   1      < 0.1%
58133.3199    1      < 0.1%
58090.95207   1      < 0.1%
58071.19982   1      < 0.1%
58071.19977   1      < 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
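
The calculation described above can be sketched in plain Python. The data are illustrative and not taken from the report, and the function name `pearson_r` is just a label for this example:

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (std_x * std_y)

# Illustrative data: y is an exact increasing linear function of x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]
r_linear = pearson_r(xs, ys)

# Shifting and rescaling x (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * x - 3 for x in xs], ys)
print(r_linear, r_rescaled)
```

Both calls give a value of 1 up to floating-point rounding, which illustrates the invariance under changes of location and scale.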

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
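
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Tied values receive the average of their ranks, which is a common convention but an assumption here, since the report does not say how ties are handled. The data are illustrative.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: monotonic but nonlinear (y = x cubed).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this example ρ reaches 1 (up to rounding) while Pearson's r stays below 1, which is the difference between monotonic and linear correlation that the text describes.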

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
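
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch with illustrative data follows. Tied pairs count as neither concordant nor discordant here, whereas libraries often apply a tie-adjusted variant (tau-b), so results may differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Illustrative data: one swapped pair out of six possible pairs.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau_a(xs, ys)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.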

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
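
A sketch of the bias-corrected estimator as it is commonly written following Bergsma (2013), computed from a contingency table of observed counts. This illustrates the formula and is not the report generator's own code. The function name and the example tables are assumptions made for the example.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0  # degenerate table (single row or column): no association measurable
    return math.sqrt(phi2_corr / denominator)

# Illustrative tables: independent counts versus perfectly associated counts.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The independent table yields 0 and the perfectly associated one yields 1 (up to rounding), matching the range described above.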

Missing values

Sample

First rows

name email_id gender dt_applied university loan_amnt funded_amnt_inv term int_rate installment grade sub_grade home_ownership annual_inc verification_status loan_writeoff purpose zip_code add_state dti delinq_2yrs inq_last_6mths pub_rec revol_bal total_paymnt
0Calley Gironcgiron0@ehow.comFemale01/01/81Warner Southern College50004975.036 months0.107162.87BB2RENT24000.0Verified0credit_card860xxAZ27.65010136485863.155187
1Linus Studlstud1@washington.eduMale02/01/81Shri Lal Bahadur Shastri Rashtriya Sanskrit Vidyapeetha25002500.060 months0.15359.83CC4RENT30000.0Source Verified1car309xxGA1.0005016871014.530000
2Lorelle Ambagelambage2@wix.comFemale03/01/81Technische Universität Bergakademie Freiberg24002400.036 months0.16084.33CC5RENT12252.0Not Verified0small_business606xxIL8.7202029563005.666844
3Anna-diane Larratalarrat3@economist.comFemale04/01/81Divine Word College of Legazpi1000010000.036 months0.135339.31CC1RENT49200.0Source Verified0other917xxCA20.00010559812231.890000
4Gill RuskeNaNFemale05/01/81East China Jiao Tong University30003000.060 months0.12767.79BB5RENT80000.0Source Verified0other972xxOR17.94000277834066.908161
5Evelyn MacFaulemacfaul5@theatlantic.comFemale06/01/81Ahmedabad University50005000.036 months0.079156.46AA4RENT36000.0Source Verified0wedding852xxAZ11.2003079635632.210000
6Ainslie Rainardarainard6@virginia.eduFemale07/01/81NaN70007000.060 months0.160170.08CC5RENT47004.0Not Verified0debt_consolidation280xxNC23.510101772610137.840010
7Emmott Hambyehamby7@prnewswire.comMale08/01/81Institute of Business Management30003000.036 months0.186109.43EE1RENT48000.0Source Verified0car900xxCA5.3502082213939.135294
8Shem Toomerstoomer8@home.plMale09/01/81Osaka University of Education56005600.060 months0.213152.39FF2OWN40000.0Source Verified1small_business958xxCA5.550205210647.500000
9Giana Aberhartgaberhart9@mozilla.comFemale10/01/81American Public University53755350.060 months0.127121.45BB5RENT15000.0Verified1other774xxTX18.0800092791484.590000

Last rows

name email_id gender dt_applied university loan_amnt funded_amnt_inv term int_rate installment grade sub_grade home_ownership annual_inc verification_status loan_writeoff purpose zip_code add_state dti delinq_2yrs inq_last_6mths pub_rec revol_bal total_paymnt
3943Merla Thebemthebeq7@cocolog-nifty.comFemale21/10/91North Eastern Hill University60006000.036 months0.163211.81DD1RENT39564.0Verified1debt_consolidation606xxIL23.7821020283388.960000
3944Marcellina Dinnegesmdinnegesq8@infoseek.co.jpFemale22/10/91Universidade Católica de Santos24002400.036 months0.11779.39BB3RENT39800.0Not Verified0other303xxGA14.32000154972836.660516
3945Way Symondswsymondsq9@mlb.comMale23/10/91American International University West Africa2500025000.060 months0.183638.25DD5MORTGAGE156000.0Source Verified0house944xxCA5.850001070937936.750000
3946Ailene MatejkaNaNFemale24/10/91Kaya University2000020000.036 months0.117661.52BB3RENT80700.0Verified0debt_consolidation946xxCA13.67010721123406.523000
3947Samuel OverelNaNMale25/10/91Northwestern University1200012000.060 months0.183306.36DD5MORTGAGE34000.0Not Verified1debt_consolidation177xxPA12.5600061149667.950000
3948Corbie Creeboeccreeboeqc@sitemeter.comMale26/10/91Shaheed Rajaei Teacher Training University1200012000.036 months0.135407.17CC1RENT125000.0Source Verified0wedding086xxNJ13.180104628614657.917650
3949Bobbe Ochterloniebochterlonieqd@ezinearticles.comFemale27/10/91Dhofar University1500015000.036 months0.124501.23BB4RENT72000.0Verified0debt_consolidation104xxNY7.470101214716729.253640
3950Corella Espositocespositoqe@macromedia.comFemale28/10/91University of Jan Evangelista Purkyne1200012000.036 months0.060365.23AA1OWN48000.0Not Verified0debt_consolidation365xxAL23.350002238513148.137860
3951Prince Dibdinpdibdinqf@businessinsider.comMale29/10/91College in Sládkovičovo1500015000.060 months0.160364.46CC5RENT50000.0Verified1debt_consolidation907xxCA18.26010979910883.540000
3952Georgette Warrattgwarrattqg@java.comFemale30/10/91Technical University of Lublin1500014975.060 months0.153358.98CC4MORTGAGE32976.0Not Verified1debt_consolidation177xxPA17.90010795611704.260000